reGenotyper: Detecting mislabeled samples in genetic data
Similar resources
reGenotyper: Detecting mislabeled samples in genetic data
In high-throughput molecular profiling studies, genotype labels can be wrongly assigned at various experimental steps; the resulting mislabeled samples seriously reduce the power to detect the genetic basis of phenotypic variation. We have developed an approach to detect potential mislabeling, recover the "ideal" genotype and identify "best-matched" labels for mislabeled samples. On average, we...
Identifying Mislabeled Training Data
This paper presents a new approach to identifying and eliminating mislabeled training instances for supervised learning. The goal of this approach is to improve classification accuracies produced by learning algorithms by improving the quality of the training data. Our approach uses a set of learning algorithms to create classifiers that serve as noise filters for the training data. We evaluate sin...
Detecting outlier samples in microarray data.
In this paper, we address the problem of detecting outlier samples with highly different expression patterns in microarray data. Although outliers are not common, they appear even in widely used benchmark data sets and can negatively affect microarray data analysis. It is important to identify outliers in order to explore underlying experimental or biological problems and remove erroneous data....
An algorithm for correcting mislabeled data
Reliable evaluation for the performance of classifiers depends on the quality of the data sets on which they are tested. During the collecting and recording of a data set, however, some noise may be introduced into the data, especially in various real-world environments, which can degrade the quality of the data set. In this paper, we present a novel approach, called ADE (automatic data enhance...
A Robust Boosting Method for Mislabeled Data
We propose a new, robust boosting method by using a sigmoidal function as a loss function. In deriving the method, the stagewise additive modelling methodology is blended with the gradient descent algorithms. Based on intensive numerical experiments, we show that the proposed method is actually better than AdaBoost and other regularized methods in test error rates in the case of noisy, ...
Journal
Journal title: PLOS ONE
Year: 2017
ISSN: 1932-6203
DOI: 10.1371/journal.pone.0171324